Configuration Management and Open Source Projects
Abstract
Configuration management tools are at the heart of every software project. Thus, it should not be surprising that they play a central role in Open Source projects as well. Most prominent in use is CVS, which is, indeed, an Open Source system in its own right. In this position paper we examine why CVS plays such a major role in the management of Open Source projects. Furthermore, we identify some areas in which we believe CVS should be improved, both in the short and long term.

Concurrent Versions Systems

As one of the essential tools needed during the development of a software product, configuration management tools were among the first faced with the reality of having to operate in a distributed setting. In response, many distributed CM systems have been developed in the past few years (e.g., ClearCase MultiSite [1], Gradient [3], ScmEngine [5], DISC [12], WWCM [13], DSCS [16], Perforce [17]). Despite this broad availability, it is clear that a single system has emerged as the de facto configuration management system used in Open Source projects. In fact, this CM system, CVS [4], has been adopted by almost every major Open Source project. Further evidence of its popularity is the fact that CVS is the only CM system that has a book dedicated to its use in Open Source projects [9]. Several reasons can be identified for this intriguing phenomenon.

• The CM policy embedded in CVS closely matches the Open Source process. An Open Source project is typically organized around a central repository from which individual developers retrieve copies of the project. Developers make their changes within one of these copies. Once the changes are complete, they update the repository with new versions of the artifacts that have changed. Other developers synchronize their copies of the project by periodically downloading updated versions of artifacts and resolving any conflicts that may exist.
This is exactly the CM policy that CVS excels in supporting: CVS is based on a single level of transactions, each of which relies on an optimistic scheme of conflict resolution. This one-to-one correspondence between the actual Open Source process and the process supported by CVS makes it very appealing to use CVS in Open Source projects.

• CVS supports decentralized software development. Since Open Source projects involve developers who are located all over the world, a configuration management tool used in an Open Source project must operate in a decentralized and distributed setting. Preferably, the tool even supports intermittent disconnected operation, so that a developer does not have to be continuously connected to a main repository with artifacts. Although not initially devised as a distributed CM system, CVS has been enhanced over the years to provide several methods of access to its central repository with artifacts. Combined with its optimistic method of resolving conflicts, this means that CVS precisely matches the distributed capabilities needed in an Open Source project.

• CVS is free, yet well maintained. Since Open Source projects typically have little to no funding and commercial CM systems tend to be rather expensive, the use of a commercial CM system in an Open Source project is usually impossible. CVS is free. Despite being free, however, CVS is well maintained and rather complete in functionality when compared to the other freely available CM systems. In fact, CVS is an Open Source project itself and has been in widespread use for many years now. As a result, many of its initial problems have been solved and CVS is currently one of the best freely available CM systems.

Given this combination of factors, it should come as no surprise that CVS is so widely used in Open Source projects.
It provides the necessary functionality at a more than reasonable price: it is free, yet easy to install, set up, learn, and use.

Potential Short-Term Enhancements to CVS

With its unparalleled success, complacency seems to have settled into CVS and the functionality that it provides to its users. In fact, the functionality and CM policy that form the core of CVS have not been enhanced for quite some time now. A close examination of the Open Source process, however, suggests that several enhancements could greatly increase the applicability of CVS in the future. Specifically, we believe that the following four enhancements should be made in the near future.

• An infrastructure that supports multiple repositories. More and more Open Source projects are based upon pieces of software from other Open Source projects. Currently, these pieces of software need to be periodically incorporated via the vendor-code management functions of CVS. Although certainly usable, this solution becomes unwieldy if many subcomponents are present that each have a different release schedule. It would be preferable to link various CVS repositories together to directly and continuously import source code for subcomponents. In essence, this brings an automated and enhanced version of a tool like SRM [20] to the software development process.

• Versioning of directories. One obvious improvement to be made to CVS is its handling of directory versioning. Currently, directories are not versioned at all, even though their contents can change over time. This not only leads to a rather crude way of handling these types of changes, but also to an overuse of tags to label the various configurations in which a project may exist over time. Given that more and more Open Source projects create a large number of configurations and regularly reorganize their project structure, this is a rather serious problem that deserves immediate attention.
Fortunately, the well-known solution of versioning directories solves this problem: it provides a clean way of dealing with the changing content of a directory and it provides a convenient and natural way of dealing with configurations. This is demonstrated by, for example, PRCS [14] and COOP/Orm [15], both of which are CM systems that intrinsically support the versioning of directories.

• Private versioning capabilities. Although developers may store intermediate versions of artifacts in a CVS repository, they are generally encouraged to commit only complete and working changes. Developers are therefore left without any versioning support in their private workspaces. As Open Source projects are becoming larger and changes more complex, such a capability is much needed. As demonstrated, for example, by Continuus [6] and Perforce [17], the availability of such functionality enhances the development experience and typically leads to the creation of many intermediate versions before the final changes are stored in the main repository. These intermediate versions remain private to a developer and they neither interfere with changes from others, nor clutter the version history in the main repository.

• Repository Replication. It is common to use CVS in combination with a replication program like rsync [19] to improve access times for developers who are physically located on a different continent than the main project repository. Although certainly beneficial, this solution has the problem that conflicts arising during synchronization cannot be resolved by rsync. Instead, the synchronization fails and manual intervention is needed to integrate and merge the changes from developers who use different instances of a replicated repository.
Since rsync and CVS share many pieces of functionality, it should be possible to merge both into an integrated solution that resolves conflicts during synchronization in the same way CVS resolves conflicts during regular development. This would lead to a solution like ClearCase MultiSite [1].

It should be observed that each of these enhancements is based on solutions that already exist in commercial CM systems. They can, in effect, be seen as bringing CVS up to date with some of the advanced functionality that not only has emerged in today's CM systems, but also has proven to be very beneficial. It should also be noted that none of the above suggestions involves changing the core policy or functionality of CVS. The basic premise of a transaction-oriented CM system that resolves conflicts in an optimistic way remains. The functionality suggested merely increases the applicability and utility of CVS; it does not change its fundamental principles.

Potential Long-Term Enhancements to CVS

Even with the short-term enhancements suggested in the previous section, it remains an open question how long CVS, in its current incarnation, will survive as the myriad of available CM systems continue to evolve and incorporate more advanced functionality. Therefore, it may be time to look into a complete redesign of CVS that radically advances its functionality. In fact, we believe it is possible to leapfrog most of the existing CM systems in terms of functionality and popularity if a new, reincarnated CVS supports a complete cycle of the Open Source process and not just version control.
In particular, we suggest that CVS be redesigned to include not only the changes suggested in the previous section, but also changes that lead to the incorporation of such activities as release management (automatically packaging software, creating and maintaining a change log, and publishing a package on a Web site), bug tracking (filing bug reports, keeping an archive of resolved bugs, and associating bug reports with those versions of the source code that fix each bug), and deployment (installing the software at the client side, periodically polling for updates, and actually upgrading the version of the software on the installed base). Although this vision is ambitious, several pieces of infrastructure exist that have proven to be beneficial in their respective domains and that may help in realizing an integrated and rejuvenated CVS.

• SRM. SRM is a software release management system that manages software releases stemming from multiple different sites [18,20]. SRM integrates the development process with the deployment process. It supports groups of distributed software development organizations with a simple release process that hides distribution. Specifically, developers are supported by allowing the specification of cross-site dependencies and users are supported by allowing the retrieval, via the Web, of a system of systems in a single step and as a single package.

• Software Dock. The Software Dock is a deployment system that manages software systems after they have been released [10,11]. Based on precise deployment instructions that are captured in a Deployable Software Description (DSD), specific agents install, update, and reconfigure software at a consumer site. The DSD is expressive enough to handle dependencies among software, even if the software stems from different release sites.

• RPM. RPM [2] is a system that can be viewed as an intermediate between SRM and the Software Dock.
Although not as fully functional as the Software Dock in managing a deployed software system, it is more advanced in its integration between the release site (similar to SRM) and the customer site (similar to the Software Dock).

In addition to these systems, others can be adapted for bug tracking (such as GNATS [7]), repository synchronization (such as rsync [19]), and providing a distributed CM infrastructure (such as NUCM [21,22] or Adele [8]).

Basically, our vision is for a component-based, fully distributed and decentralized configuration management system that manages artifacts from their incarnation to their eventual destination as an installed piece of software at a consumer site. Although much work needs to be done to fulfill this vision, we believe it would be a great advance, not only for the functionality of CVS, but also for the Open Source community at large, which would be served by a CM system that intimately supports its software process.
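
To illustrate the single-level, optimistic transaction policy described in the first section, the following is a minimal Python sketch and not CVS's actual implementation. The `Repository` class, its methods, and `StaleRevisionError` are hypothetical names introduced only for this example. The sketch assumes that each file has a linear revision history and that a commit is rejected when it is based on an outdated revision; the developer then synchronizes and retries. The automatic textual merge that CVS performs during an update is replaced here by a hand-written merged string.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


class StaleRevisionError(Exception):
    """Raised when a commit is based on an outdated repository revision."""


@dataclass
class Repository:
    """Hypothetical central repository: maps a path to its list of revisions."""
    history: Dict[str, List[str]] = field(default_factory=dict)

    def head(self, path: str) -> int:
        # Revision numbers start at 1; 0 means the file does not exist yet.
        return len(self.history.get(path, []))

    def checkout(self, path: str) -> Tuple[int, str]:
        # Return the latest revision number together with its content.
        revisions = self.history[path]
        return len(revisions), revisions[-1]

    def commit(self, path: str, base_rev: int, content: str) -> int:
        # Optimistic check: no locks are taken. The commit is simply
        # rejected if someone else committed after base_rev was fetched.
        if base_rev != self.head(path):
            raise StaleRevisionError(f"{path}: update needed before commit")
        self.history.setdefault(path, []).append(content)
        return self.head(path)


repo = Repository()
repo.commit("README", 0, "v1")

# Two developers retrieve the same revision and work independently.
alice_rev, _ = repo.checkout("README")
bob_rev, _ = repo.checkout("README")

repo.commit("README", alice_rev, "v1 + Alice's change")  # accepted
try:
    repo.commit("README", bob_rev, "v1 + Bob's change")  # rejected: stale base
except StaleRevisionError:
    # Bob synchronizes, merges by hand (assumption), and commits again.
    bob_rev, latest = repo.checkout("README")
    repo.commit("README", bob_rev, latest + " + Bob's change")
```

Neither developer ever waits on a lock in this sketch; the cost of a conflict falls on whoever commits second, which matches the update-then-commit cycle the paper describes.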
Publication date: 2000